
Add benchmark for return or = vs ||= #133

Closed
jperville wants to merge 1 commit into fastruby:main from PerfectMemory:return-or-set-vs-or-equals

Conversation

@jperville
Contributor

The ||= operator is commonly used to implement memoization.

This benchmark shows a faster way to implement memoization (about 1.3x in the run below): an explicit return when the memoized value can be reused, with a fallback to setting the memoized value.

$ ruby -v code/general/return-or-set-vs-or-equals.rb
ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux-gnu]
Warming up --------------------------------------
                 ||=    12.219k i/100ms
         return or =    16.085k i/100ms
Calculating -------------------------------------
                 ||=    127.190k (± 0.9%) i/s -    647.607k in   5.092082s
         return or =    166.137k (± 0.9%) i/s -    836.420k in   5.034955s

Comparison:
         return or =:   166137.0 i/s
                 ||=:   127190.4 i/s - 1.31x  slower
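
For readers without the benchmark file at hand, the two variants being compared can be sketched roughly as follows. This is an illustrative sketch, not the PR's actual file: the class name `Memoized`, the method names, and the constant `VALUE` are assumptions standing in for an expensive computation.

```ruby
# Illustrative sketch only; not the benchmark file from the PR.
class Memoized
  VALUE = "expensive result".freeze # hypothetical stand-in for a costly computation

  def initialize
    @value = nil # defined up front, so a plain truthiness check is enough
  end

  # Variant 1: conventional memoization with ||=
  def or_equals
    @value ||= VALUE
  end

  # Variant 2: explicit early return, falling back to assignment
  def return_or_set
    return @value if @value
    @value = VALUE
  end
end

a = Memoized.new
b = Memoized.new
puts a.or_equals     # first call assigns VALUE and returns it
puts b.return_or_set # same result through the early-return form
```

Both methods return the same object on every call; only the path taken on the already-memoized case differs.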

end

def value2
return @value if @value

`return @value if defined? @value` is a real-world use case for avoiding an undefined exception.

Contributor Author

In my case I can ensure that the instance variable is always defined.

If I change the test from `if @value` to `if defined?(@value) && @value`, then the benchmark results are different and `||=` becomes the faster variant.

@texpert texpert Sep 21, 2017

`if defined?(@value) && @value` is the equivalent of `||=`. So maybe it makes sense, for a full comparison, to cover all the use cases:

# For trying again if nil and not sure if variable is defined
1. `||=`
2. `return @value if defined?(@value) && @value`

# For trying again if nil and sure the variable is defined
3. `return @value if @value`

# For not trying again if nil and not sure if variable is defined
4. `return @value if defined? @value`
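
The difference between these four cases shows up when the computation returns nil. The sketch below counts how often the computation runs under each variant; the class `Variants` and the helpers `compute` and `calls` are hypothetical names introduced only for this illustration.

```ruby
# Sketch of the four variants; `compute` and `calls` are hypothetical helpers.
class Variants
  attr_reader :calls

  def initialize(result)
    @result = result # what the "expensive" computation returns
    @calls = 0       # how many times the computation actually ran
    @c = nil         # variant 3 assumes its variable is always defined
  end

  def compute
    @calls += 1
    @result
  end

  # 1. ||=
  def v1
    @a ||= compute
  end

  # 2. defined? plus truthiness check
  def v2
    return @b if defined?(@b) && @b
    @b = compute
  end

  # 3. truthiness check only
  def v3
    return @c if @c
    @c = compute
  end

  # 4. defined? only
  def v4
    return @d if defined?(@d)
    @d = compute
  end
end

%i[v1 v2 v3 v4].each do |name|
  obj = Variants.new(nil)
  3.times { obj.public_send(name) }
  puts "#{name}: computed #{obj.calls} time(s) for a nil result"
end
```

With a nil result, variants 1 to 3 recompute on every call, while variant 4 computes once and then keeps returning the stored nil. With a truthy result, all four compute exactly once.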

@ixti
Collaborator

ixti commented Sep 21, 2017

I'm not quite sure what this benchmark is trying to show. It's obvious that `||=` is doing a bit more than just `=`, but the purpose is different. What's the point of assigning a constant? If you use test cases that are closer to the real world, the difference will be unnoticeable. Comparing `=` to `||=` is IMO even less valuable than comparing Enumerable#each to Enumerable#map.

@jperville
Contributor Author

jperville commented Sep 21, 2017

@ixti we got the following real world case: we are using the json-ld gem and want to frame a graph.

In the following stackprof trace, we spend 26% of the running time in 2 methods, with 12.1% in RDF::URI#value:

$ stackprof tmp/stackprof-cpu-*.dump --text --limit 2
==================================
  Mode: cpu(1000)
  Samples: 107 (0.00% miss rate)
  GC: 11 (10.28%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
        18  (16.8%)          15  (14.0%)     RDF::URI#==
        13  (12.1%)          13  (12.1%)     RDF::URI#value

Here is the code of RDF::URI#value (on GitHub: https://github.com/ruby-rdf/rdf/blob/2.2.9/lib/rdf/model/uri.rb#L798-L806):

$ stackprof tmp/stackprof-cpu-*.dump --text --method 'RDF::URI#value'
RDF::URI#value (/home/julien/RubymineProjects/rdf/lib/rdf/model/uri.rb:798)
  samples:    13 self (12.1%)  /     13 total (12.1%)
  callers:
       9  (   69.2%)  RDF::URI#to_str
       4  (   30.8%)  RDF::URI#to_str
  code:
                                  |   798  |     def value
                                  |   799  |       @value ||= [
                                  |   800  |         ("#{scheme}:" if absolute?),
                                  |   801  |         ("//#{authority}" if authority),
                                  |   802  |         path,
                                  |   803  |         ("?#{query}" if query),
                                  |   804  |         ("##{fragment}" if fragment)
   13   (12.1%) /    13  (12.1%)  |   805  |       ].compact.join("").freeze
                                  |   806  |     end

The expensive build of the default value is only invoked once in our case (out of tens of thousands of calls to RDF::URI#value).

By adding an early return of the memoized variable if present, the time spent in the method went from 12% to 9% of the total, as shown here:

$ stackprof tmp/stackprof-cpu-*.dump --text --method 'RDF::URI#value'
RDF::URI#value (/home/julien/RubymineProjects/rdf/lib/rdf/model/uri.rb:798)
  samples:     9 self (8.7%)  /      9 total (8.7%)
  callers:
       6  (   66.7%)  RDF::URI#to_str
       3  (   33.3%)  RDF::URI#to_str
  code:
                                  |   798  |     def value
    9    (8.7%) /     9   (8.7%)  |   799  |       return @value if @value
                                  |   800  |       @value ||= [

jperville force-pushed the return-or-set-vs-or-equals branch 2 times, most recently from a98c882 to d69d4d5 on September 21, 2017 13:06

@jperville
Contributor Author

@texpert I have updated my benchmarks to take your comments into account. The results are interesting if we are sure that the memoized variable is defined: it seems that the `defined?` test is what explains such a big difference.

$ ruby -v code/general/return-or-set-vs-or-equals.rb
ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux-gnu]
Warming up --------------------------------------
                 ||=    12.493k i/100ms
return @value if defined?(@value) && @value)
                        11.297k i/100ms
return @value if defined?(@value)
                        11.903k i/100ms
return @value if @value
                        16.584k i/100ms
Calculating -------------------------------------
                 ||=    128.041k (± 2.7%) i/s -    649.636k in   5.077706s
return @value if defined?(@value) && @value)
                        112.480k (± 2.4%) i/s -    564.850k in   5.024635s
return @value if defined?(@value)
                        119.103k (± 2.5%) i/s -    595.150k in   5.000107s
return @value if @value
                        167.953k (± 2.3%) i/s -    845.784k in   5.038625s

Comparison:
return @value if @value:   167953.2 i/s
                 ||=:   128041.4 i/s - 1.31x  slower
return @value if defined?(@value):   119103.0 i/s - 1.41x  slower
return @value if defined?(@value) && @value):   112480.4 i/s - 1.49x  slower

@ixti
Collaborator

ixti commented Sep 21, 2017

Oh. That's interesting. Although still a bit strange:

class Memoizer
  VALUE = "xxx"

  def initialize
    @value = nil
  end

  def return3
    return @value if defined?(@value)
    @value
  end
end

At a bare minimum, `#return3` will always return nil. But your real-world example is actually pretty interesting indeed.
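
The point about `#return3` can be checked directly. In the sketch below the class name `PreInitialized` is hypothetical, and the fallback line assigns `VALUE` (an assumption, to show that the assignment is never reached once the constructor has set the variable to nil).

```ruby
# Sketch: a defined? guard combined with an initializer that sets the ivar to nil.
class PreInitialized
  VALUE = "xxx"

  def initialize
    @value = nil # the variable is now defined, even though it holds nil
  end

  def return3
    return @value if defined?(@value) # always true after initialize
    @value = VALUE                    # never reached
  end
end

m = PreInitialized.new
p m.instance_variable_defined?(:@value) # true
p m.return3                             # nil
```

So a `defined?` guard only memoizes when the variable is left unassigned until the first call.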

@Arcovion
Collaborator

In my tests @value || @value = VALUE is fastest

@Arcovion
Collaborator

I'm guessing it's slightly faster because `@value ||= VALUE` has to check whether the instance variable exists, and assign it otherwise. `@value || ...` simply reads `@value` without checking; if the variable doesn't exist, the read evaluates to nil (with a warning under `ruby -w`) rather than raising.
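
A small sketch of how the `@value || @value = VALUE` form behaves when the variable has never been assigned. The class name `Lazy` and the constant are hypothetical; the point is only that reading an unassigned instance variable evaluates to nil, so the right-hand assignment runs on the first call.

```ruby
# Sketch: `@value || @value = VALUE` on an instance variable that was never assigned.
class Lazy
  VALUE = "computed".freeze # hypothetical constant

  def value
    # Parsed as: @value || (@value = VALUE)
    @value || @value = VALUE
  end
end

obj = Lazy.new
first  = obj.value # the read yields nil, so the assignment runs
second = obj.value # now truthy, so the left side short-circuits
puts first.equal?(second)
```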

@jperville
Contributor Author

@Arcovion that's good enough for my use case, since I know that the variable exists. Updating my PR with your solution as the fastest.

@ixti
Collaborator

ixti commented Sep 21, 2017

First of all, I want to emphasize that `||=` and `||` work ONLY if your memoized value is NON-falsy, so if your expensive computation returns nil or false, then `||=` won't work for you at all. Secondly, reading `@varname` when it was not initialized before will cause a Ruby warning; the only form you can use without a `defined?` guard is `||=`:

class X
  def a
    @a ||= 1
  end

  def b
    @b || @b = 2
  end
end

puts X.new.a
puts X.new.b

Run that with `ruby -w` and you will get `warning: instance variable @b not initialized`.

jperville force-pushed the return-or-set-vs-or-equals branch from d69d4d5 to acc35f2 on September 21, 2017 16:11

@ixti
Collaborator

ixti commented Sep 21, 2017

Also, benchmarks are highly affected by `X.times`. So this is my small take on these benchmarks:

require "benchmark/ips"

class Memoizer
  VALUE  = "some value".freeze
  CYCLES = 10_000

  def initialize
    @v1 = nil
  end

  def v11
    @v1 ||= VALUE
  end

  def v12
    return @v1 if @v1
    @v1 = VALUE
  end

  def v13
    @v1 || @v1 = VALUE
  end

  def v21
    @v2 ||= VALUE
  end

  def v22
    return @v2 if defined?(@v2)
    @v2 = VALUE
  end

  def v23
    return @v2 if instance_variable_defined?(:@v2)
    @v2 = VALUE
  end

  def self.example(name)
    obj = new

    obj.singleton_class.class_eval <<-RUBY
      alias_method :run, :#{name}
      def to_proc; proc { #{Array.new(CYCLES, "run").join(" ; ")} }; end
    RUBY

    obj
  end
end

puts "== v1x"

Benchmark.ips do |x|
  x.report("v11", &Memoizer.example(:v11))
  x.report("v12", &Memoizer.example(:v12))
  x.report("v13", &Memoizer.example(:v13))
  x.compare!
end

puts "== v2x"

Benchmark.ips do |x|
  x.report("v21", &Memoizer.example(:v21))
  x.report("v22", &Memoizer.example(:v22))
  x.report("v23", &Memoizer.example(:v23))
  x.compare!
end

With the above, the results are:

                 v13:     3195.8 i/s
                 v12:     3160.5 i/s - same-ish: difference falls within error
                 v11:     1738.9 i/s - 1.84x  slower

                 v21:     1707.4 i/s
                 v22:     1565.7 i/s - 1.09x  slower
                 v23:     1305.5 i/s - 1.31x  slower
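
The `example` helper above avoids loop overhead by unrolling the calls into the block body. The sketch below isolates that trick without benchmark-ips; the `Counter` class and its `tick` method are hypothetical, and `CYCLES` is reduced to keep the sketch quick.

```ruby
# Sketch of the call-unrolling trick: the generated proc contains CYCLES literal calls.
class Counter
  CYCLES = 1_000 # smaller than the 10_000 used in the benchmark

  attr_reader :count

  def initialize
    @count = 0
  end

  def tick
    @count += 1
  end

  def self.example(name)
    obj = new
    # Build "run ; run ; ..." once, so no loop runs inside the measured block
    body = Array.new(CYCLES, "run").join(" ; ")
    obj.singleton_class.class_eval <<-RUBY
      alias_method :run, :#{name}
      def to_proc; proc { #{body} }; end
    RUBY
    obj
  end
end

obj = Counter.example(:tick)
obj.to_proc.call # what `&obj` would hand to a method expecting a block
puts obj.count   # 1000
```

Because the object defines `to_proc`, passing `&obj` to a method such as `x.report` converts it into the unrolled block, so each measured iteration performs `CYCLES` method calls with no iterator in between.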

The `||=` operator is commonly used to implement memoization, but when the
memoization variable always exists, some optimization is possible.

jperville force-pushed the return-or-set-vs-or-equals branch from acc35f2 to f93a559 on September 21, 2017 16:13

# For trying again if nil and not sure if variable is defined
def return1
return @value if defined?(@value) && @value
Collaborator

There's no point in this test at all: `@value` is defined in `initialize`.


# For not trying again if nil and not sure if variable is defined
def return3
return @value if defined?(@value)
Collaborator

Same as in the case of `return1`.

jperville added a commit to PerfectMemory/rdf that referenced this pull request Sep 26, 2017
gkellogg pushed a commit to ruby-rdf/rdf that referenced this pull request Sep 26, 2017
@etagwerker
Member

Closing this because there were comments with valid points that haven't been addressed in years.

@etagwerker etagwerker closed this Nov 6, 2022